Self-Organizing Maps in data analysis - notes on overfitting and overinterpretation
Authors
Abstract
The Self-Organizing Map (SOM) is a widely used tool in exploratory data analysis. Visual inspection of the SOM can be used to list potential dependencies between variables, which are then validated with more principled statistical methods. In this paper we discuss the use of the SOM in searching for dependencies in the data. We point out that simple use of the SOM may lead to an excessive number of false hypotheses. We formulate the exact probability density model for which the SOM training gives the Maximum Likelihood estimate and show how the model parameters (neighborhood and kernel width) can be chosen to avoid overfitting. The conditional distributions from the true density model offer a consistent way to quantify and test the dependencies between variables.
Similar resources
Self-Organizing Map in Data-Analysis - Notes on Overfitting and Overinterpretation
The Self-Organizing Map, SOM, is a widely used tool in exploratory data analysis. Visual inspection of the SOM can be used to list potential dependencies between variables, that are then validated with more principled statistical methods. In this paper we discuss the use of the SOM in searching for dependencies in the data. We point out that simple use of the SOM may lead to excessive number of...
Landforms identification using neural network-self organizing map and SRTM data
During an 11-day mission in February 2000, the Shuttle Radar Topography Mission (SRTM) collected data over 80% of the Earth's land surface, for all areas between 60 degrees N and 56 degrees S latitude. Since SRTM data became available, many studies have used them for topography and morphometric landscape analysis. Exploiting SRTM data for recognition and extraction of topographic ...
Green Product Consumers Segmentation Using Self-Organizing Maps in Iran
This study aims to segment the market based on demographical, psychological, and behavioral variables, and seeks to investigate their relationship with green consumer behavior. In this research, self-organizing maps are used to segment and to determine the features of green consumer behavior. This was a survey type of research study in which eight variables were selected from the demographical,...
Steel Consumption Forecasting Using Nonlinear Pattern Recognition Model Based on Self-Organizing Maps
Steel consumption is a critical factor affecting pricing decisions and a key element to achieve sustainable industrial development. Forecasting future trends of steel consumption based on analysis of nonlinear patterns using artificial intelligence (AI) techniques is the main purpose of this paper. Because there are several features affecting target variable which make the analysis of relations...
Avoiding overfitting in multilayer perceptrons with feeling-of-knowing using self-organizing maps.
Overfitting in multilayer perceptron (MLP) training is a serious problem. The purpose of this study is to avoid overfitting in on-line learning. To overcome the overfitting problem, we have investigated feeling-of-knowing (FOK) using self-organizing maps (SOMs). We propose MLPs with FOK using the SOMs method to overcome the overfitting problem. In this method, the learning process advances acco...